AITopics

2509.1787

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine (1.00)
Transportation > Freight & Logistics Services (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Artificial Intelligence May-27-2025

Single-agent or Multi-agent Systems? Why Not Both?

Gao, Mingyan, Li, Yanzi, Liu, Banruo, Yu, Yifan, Wang, Phillip, Lin, Ching-Yu, Lai, Fan

Multi-agent systems (MAS) decompose complex tasks and delegate subtasks to different large language model (LLM) agents and tools. Prior studies have reported superior accuracy for MAS across diverse domains, enabled by long-horizon context tracking and error correction through role-specific agents. However, the design and deployment of MAS incur higher complexity and runtime cost than single-agent systems (SAS). Meanwhile, frontier LLMs, such as OpenAI-o3 and Gemini-2.5-Pro, have rapidly advanced in long-context reasoning, memory retention, and tool usage, mitigating many limitations that originally motivated MAS designs. In this paper, we conduct an extensive empirical study comparing MAS and SAS across various popular agentic applications. We find that the benefits of MAS over SAS diminish as LLM capabilities improve, and we propose efficient mechanisms to pinpoint the error-prone agent in MAS. Furthermore, the performance discrepancy between MAS and SAS motivates our design of a hybrid agentic paradigm, request cascading between MAS and SAS, to improve both efficiency and capability. Our design improves accuracy by 1.1-12% while reducing deployment costs by up to 20% across various agentic applications.
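
The abstract does not spell out how requests are routed between the two system types. As a rough illustration only, one plausible reading of request cascading is to try the cheaper single-agent path first and escalate to the multi-agent path when a check rejects the answer. Every name below (`cascade`, `run_sas`, `run_mas`, `accept`, and the cost figures) is a hypothetical placeholder, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CascadeResult:
    answer: str
    route: str   # "SAS" if the single-agent answer was accepted, else "MAS"
    cost: float  # total cost in arbitrary, assumed units


def cascade(
    request: str,
    run_sas: Callable[[str], str],
    run_mas: Callable[[str], str],
    accept: Callable[[str, str], bool],
    sas_cost: float = 1.0,   # assumed relative cost of one SAS call
    mas_cost: float = 5.0,   # assumed relative cost of one MAS call
) -> CascadeResult:
    """Try the single-agent system first; fall back to the multi-agent system."""
    answer = run_sas(request)
    if accept(request, answer):
        return CascadeResult(answer, "SAS", sas_cost)
    # The SAS answer was rejected, so both calls are paid for.
    return CascadeResult(run_mas(request), "MAS", sas_cost + mas_cost)
```

In this sketch the saving comes from requests that never reach the multi-agent path, so the quality of the `accept` check decides whether the cascade helps or only adds the cost of a wasted single-agent call.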

large language model, machine learning, natural language, (21 more...)

2505.18286

Country: North America > United States > Illinois (0.27)

Genre:

Workflow (1.00)
Research Report > New Finding (0.67)
Research Report > Experimental Study (0.46)

Industry:

Leisure & Entertainment > Games (1.00)
Information Technology (0.67)
Health & Medicine (0.67)
Banking & Finance > Trading (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

arXiv.org Artificial Intelligence Mar-17-2025

Why Do Multi-Agent LLM Systems Fail?

Cemri, Mert, Pan, Melissa Z., Yang, Shuyi, Agrawal, Lakshya A., Chopra, Bhavya, Tiwari, Rishabh, Keutzer, Kurt, Parameswaran, Aditya, Klein, Dan, Ramchandran, Kannan, Zaharia, Matei, Gonzalez, Joseph E., Stoica, Ion

Despite growing enthusiasm for Multi-Agent Systems (MAS), where multiple LLM agents collaborate to accomplish tasks, their performance gains across popular benchmarks remain minimal compared to single-agent frameworks. This gap highlights the need to analyze the challenges hindering MAS effectiveness. In this paper, we present the first comprehensive study of MAS challenges. We analyze five popular MAS frameworks across over 150 tasks, involving six expert human annotators. We identify 14 unique failure modes and propose a comprehensive taxonomy applicable to various MAS frameworks. This taxonomy emerges iteratively from agreements among three expert annotators per study, achieving a Cohen's Kappa score of 0.88. These fine-grained failure modes are organized into three categories: (i) specification and system design failures, (ii) inter-agent misalignment, and (iii) task verification and termination. To support scalable evaluation, we integrate this taxonomy (MASFT) with LLM-as-a-Judge. We also explore whether the identified failures could be easily prevented by proposing two interventions: improved specification of agent roles and enhanced orchestration strategies. Our findings reveal that the identified failures require more complex solutions, highlighting a clear roadmap for future research. We open-source our dataset and LLM annotator.
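
For readers unfamiliar with the agreement statistic quoted above, Cohen's kappa compares the observed agreement p_o between two annotators with the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below is a minimal two-annotator version on made-up labels. How the paper aggregates agreement across its three annotators is not stated in the abstract, and the function name `cohens_kappa` is our own.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two annotators labelling the same items."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("both annotators must label the same non-empty item set")
    n = len(a)
    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of each annotator's marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    labels = count_a.keys() | count_b.keys()
    p_e = sum(count_a[k] * count_b[k] for k in labels) / (n * n)
    if p_e == 1.0:
        # Both annotators used one identical label throughout; kappa is
        # undefined here, and returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)
```

With the illustrative labels `[1, 1, 0, 0]` and `[1, 1, 0, 1]`, the annotators agree on 3 of 4 items (p_o = 0.75) and chance agreement is p_e = 0.5, which gives kappa = 0.5.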

artificial intelligence, large language model, natural language, (15 more...)

2503.13657

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.67)
Information Technology (0.67)
Leisure & Entertainment > Games (0.46)
Media > Music (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Farhang, Sadegh, Hayes, William, Murphy, Nick, Neddenriep, Jonathan, Tyris, Nicholas

A Deep Learning Approach for Imbalanced Tabular Data in Advertiser Prospecting: A Case of Direct Mail Prospecting

arXiv.org Artificial Intelligence Oct-1-2024

Acquiring new customers is a vital process for growing businesses. Prospecting is the process of identifying and marketing to potential customers using methods ranging from online digital advertising, linear television, out of home, and direct mail. Despite the rapid growth in digital advertising (particularly social and search), research shows that direct mail remains one of the most effective ways to acquire new customers. However, there is a notable gap in the application of modern machine learning techniques within the direct mail space, which could significantly enhance targeting and personalization strategies. Methodologies deployed through direct mail are the focus of this paper. In this paper, we propose a supervised learning approach for identifying new customers, i.e., prospecting, which comprises how we define labels for our data and rank potential customers. The casting of prospecting to a supervised learning problem leads to imbalanced tabular data. The current state-of-the-art approach for tabular data is an ensemble of tree-based methods like random forest and XGBoost. We propose a deep learning framework for tabular imbalanced data. This framework is designed to tackle large imbalanced datasets with a vast number of numerical and categorical features. Our framework comprises two components: an autoencoder and a feed-forward neural network. We demonstrate the effectiveness of our framework through a transparent real-world case study of prospecting in direct mail advertising. Our results show that our proposed deep learning framework outperforms the state-of-the-art tree-based random forest approach when applied in the real world.

customer, test 0, train 0, (15 more...)

2410.01157

Country:

North America > United States > California (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.86)

Industry:

Marketing (1.00)
Law (0.69)
Information Technology > Security & Privacy (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Billhardt, Holger, Fernández, Alberto, Ossowski, Sascha, Palanca, Javier, Bajo, Javier

Taxi dispatching strategies with compensations

arXiv.org Artificial Intelligence Jan-21-2024

Urban mobility efficiency is of utmost importance in big cities. Taxi vehicles are key elements in daily traffic activity. The advance of ICT and geo-positioning systems has given rise to new opportunities for improving the efficiency of taxi fleets in terms of waiting times of passengers, cost and time for drivers, traffic density, CO2 emissions, etc., by using more informed, intelligent dispatching. Still, the explicit spatial and temporal components, as well as the scale and, in particular, the dynamicity of the problem of pairing passengers and taxis in big towns, render traditional approaches for solving the standard assignment problem useless for this purpose, and call for intelligent approximation strategies based on domain-specific heuristics. Furthermore, taxi drivers are often autonomous actors and may not agree to participate in assignments that, though globally efficient, may not be sufficiently beneficial for them individually. This paper presents a new heuristic algorithm for taxi assignment to customers that considers taxi reassignments if this may lead to globally better solutions. In addition, as such new assignments may reduce the expected revenues of individual drivers, we propose an economic compensation scheme to make individually rational drivers agree to proposed modifications in their assigned clients. We carried out a set of experiments, where several commonly used assignment strategies are compared to three different instantiations of our heuristic algorithm. The results indicate that our proposal has the potential to reduce customer waiting times in fleets of autonomous taxis, while also being beneficial from an economic point of view.

assignment, customer, taxi, (16 more...)

doi: 10.1016/j.eswa.2019.01.001

2401.11553

Country:

Europe > Spain > Galicia > Madrid (0.05)
Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
Europe > Czechia > Prague (0.04)
Asia > Middle East > Oman > Muscat Governorate > Muscat (0.04)

Genre: Research Report (0.64)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

#artificialintelligence Mar-30-2023, 18:36:55 GMT

It's not all about scores. Other criteria you should consider…

As a data scientist or machine learning engineer, you spend much of your time improving a model's performance by creating new features, comparing different types of models, trying out new model architectures, and much more. In the end, it's the score on the test set that counts, so that is what you focus on when deciding on a model. However, as important as the model performance may be, there are other, secondary criteria you shouldn't forget about. What do you get from a model with almost perfect scores, if your MLOps department can't host it? How does the user feel, if the prediction is accurate, but it takes ages to get it?

criteria, neural network, prediction, (16 more...)

Country:

South America (0.05)
Africa (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

WIRED Oct-7-2022, 11:00:00 GMT

The Hottest Startups in Lisbon

Serial entrepreneurs Mila Suharev, Nils Henning, and Mitya Moskalchuk had been involved in the German startup scene for more than 15 years, successfully exiting four companies with valuations above €100 million (around $98.5 million) before deciding to launch their new startup in the Portuguese capital. "Lisbon has several ingredients making it a unique and efficient tech ecosystem," says Suharev, CEO of proptech company CASAFARI, listing factors such as quality of life, governmental programs designed to attract foreign entrepreneurs, and its Silicon Valley-like business mindset. Lisbon is increasingly becoming the tech hub of choice for many European entrepreneurs: Of the 10 CEOs profiled here, half are expats. "A new ecosystem such as the one growing in Lisbon is fascinating to experience firsthand," says Amir Bozorgzadeh, CEO of Virtualeap. "It is a melting pot of foreigners and Portuguese, working hand-in-hand amid a very sunny setting in which work-life balance is always a priority for founders."

artificial intelligence, lisbon, platform, (16 more...)

WIRED

Country:

Europe > Portugal > Lisbon > Lisbon (1.00)
North America > United States > California (0.25)
Europe > Spain (0.06)

Industry:

Banking & Finance > Trading (1.00)
Health & Medicine (0.92)
Information Technology (0.90)
Banking & Finance > Real Estate (0.71)

Technology: Information Technology > Artificial Intelligence (0.71)

#artificialintelligence Oct-3-2022, 08:06:07 GMT

Bank Customer Churn Prediction Using Machine Learning

This article was published as a part of the Data Science Blogathon. Customer Churn prediction means knowing which customers are likely to leave or unsubscribe from your service. For many companies, this is an important prediction. This is because acquiring new customers often costs more than retaining existing ones. Once you've identified customers at risk of churn, you need to know exactly what marketing efforts you should make with each customer to maximize their likelihood of staying.

algorithm, customer, decision tree, (13 more...)

Industry: Banking & Finance (0.42)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)

#artificialintelligence Sep-13-2022, 10:45:56 GMT

Using PROC DEEPCAUSAL to optimize revenue through policy evaluation

When it comes to causal inference, scoring capability is particularly beneficial. It can be used in unique ways that result in an improved decision-making process, such as gaining optimal revenue using the fewest resources. In this post, I will introduce a new scoring capability and its use cases with PROC DEEPCAUSAL. I will also show you how it utilizes Deep Neural Networks (DNNs) to perform causal inference as well as policy evaluation and comparison. Inference is not valid for the estimators when the estimates from machine learning methods are directly plugged into an econometric model. Doing so creates highly biased estimators, so econometric methods need to correct for this bias.

customer, proc deepcausal, revenue, (14 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.54)

#artificialintelligence Jun-22-2022, 00:50:23 GMT

Microsoft to restrict access to AI now deemed too risky

Microsoft has pledged to clamp down on access to AI tools designed to predict emotions, gender, and age from images, and will restrict the usage of its facial recognition and generative audio models in Azure. The Windows giant made the promise on Tuesday while also sharing its so-called Responsible AI Standard, a document in which the US corporation vowed to minimize any harm inflicted by its machine-learning software. This pledge included assurances that the biz will assess the impact of its technologies, document models' data and capabilities, and enforce stricter use guidelines. This is needed because – and let's just check the notes here – there are apparently not enough laws yet regulating machine-learning technology use. Thus, in the absence of this legislation, Microsoft will just have to force itself to do the right thing.

customer, microsoft, software, (11 more...)

AI-Alerts: 2022 > 2022-06 > AAAI AI-Alert for Jun 22, 2022 (1.00)

Country: North America > United States (0.05)

Industry: Law > Statutes (0.73)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)